Robust smooth segmentation approach for array CGH data analysis
Abstract
MOTIVATION: Array comparative genomic hybridization (aCGH) provides a genome-wide technique to screen for copy number alteration. The existing segmentation approaches for analyzing aCGH data are based on modeling data as a series of discrete segments with unknown boundaries and unknown heights. Although the biological process of copy number alteration is discrete, in reality a variety of biological and experimental factors can cause the signal to deviate from a stepwise function. To take this into account, we propose a smooth segmentation (smoothseg) approach.

METHODS: To achieve a robust segmentation, we use a doubly heavy-tailed random-effect model. The first heavy-tailed structure on the errors deals with outliers in the observations, and the second deals with possible jumps in the underlying pattern associated with different segments. We develop a fast and reliable computational procedure based on the iterative weighted least-squares algorithm with band-limited matrix inversion.

RESULTS: Using simulated and real data sets, we demonstrate how smoothseg can aid in identification of regions with genomic alteration and in classification of samples. For the real data sets, smoothseg leads to smaller false discovery rate and classification error rate than the circular binary segmentation (CBS) algorithm. In a realistic simulation setting, smoothseg is better than wavelet smoothing and CBS in identification of regions with genomic alterations and better than CBS in classification of samples. For comparative analyses, we demonstrate that segmenting the t-statistics performs better than segmenting the data.

AVAILABILITY: The R package smoothseg to perform smooth segmentation is available from http://www.meb.ki.se/~yudpaw.
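The iterative weighted least-squares idea in the METHODS paragraph can be illustrated with a minimal sketch. This is not the smoothseg package or its API. It assumes a first-order difference penalty, Student-t style weights, and fixed scale parameters, whereas the published model may use a different difference order and estimate its variance components. The names solve_tridiagonal, t_weight and robust_smooth, and the parameters sigma, tau and df, are hypothetical. With these assumptions each iteration reduces to a tridiagonal (band-limited) linear system, solved here in linear time.

```python
import random


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm in O(n).

    lower[i] is A[i][i-1] (lower[0] is ignored) and upper[i] is
    A[i][i+1] (upper[-1] is ignored).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def t_weight(z, df):
    """Weight implied by a Student-t density for a standardized value z."""
    return (df + 1.0) / (df + z * z)


def robust_smooth(y, sigma=0.2, tau=0.05, df=1.0, n_iter=50):
    """Illustrative doubly robust smoother (assumed form, not smoothseg).

    sigma: assumed scale of the observation errors.
    tau:   assumed scale of the first differences of the fitted curve.
    df:    assumed degrees of freedom for both heavy-tailed parts.
    """
    n = len(y)
    w = [1.0] * n        # observation weights: small for outliers
    v = [1.0] * (n - 1)  # difference weights: small where the curve jumps
    f = list(y)
    for _ in range(n_iter):
        lower = [0.0] * n
        diag = [w[i] / sigma ** 2 for i in range(n)]
        upper = [0.0] * n
        rhs = [w[i] * y[i] / sigma ** 2 for i in range(n)]
        # Add the weighted first-difference penalty; it only touches
        # neighbouring entries, so the system stays tridiagonal.
        for j in range(n - 1):
            p = v[j] / tau ** 2
            diag[j] += p
            diag[j + 1] += p
            upper[j] = -p
            lower[j + 1] = -p
        f = solve_tridiagonal(lower, diag, upper, rhs)
        # Re-weight from the current fit (the "iterative" part).
        w = [t_weight((y[i] - f[i]) / sigma, df) for i in range(n)]
        v = [t_weight((f[j + 1] - f[j]) / tau, df) for j in range(n - 1)]
    return f


# Toy signal: one copy-number step at index 20 and one outlier at index 10.
rng = random.Random(1)
y = [(0.0 if i < 20 else 1.0) + rng.gauss(0.0, 0.1) for i in range(40)]
y[10] = 5.0
fit = robust_smooth(y)
print(round(fit[5], 2), round(fit[10], 2), round(fit[30], 2))
```

In this toy setting the observation weights shrink the influence of the single outlier, while the difference weights relax the smoothness penalty at the step so that the jump is not blurred. The scale values are arbitrary choices for the example and would need to be estimated or tuned for real aCGH data.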
Related resources
P-112: PGS-Array-CGH Technique: New Technical Approach to Promotion ART Outcome
Background Chromosomal abnormalities are common in embryos from assisted reproductive technology, ranging from 60% abnormal embryos in women
Stochastic segmentation models for array-based comparative genomic hybridization data analysis.
Array-based comparative genomic hybridization (array-CGH) is a high throughput, high resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimation of the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. We propose for the analysis of arra...
SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative genome hybridization data
Comparative genome hybridization (CGH) to DNA microarrays (array CGH) is a technique capable of detecting deletions and duplications in genomes at high resolution. However, array CGH studies of the human genome noting false negative and false positive results using large insert clones as probes have raised important concerns regarding the suitability of this approach for clinical diagnostic app...
FACADE: a fast and sensitive algorithm for the segmentation and calling of high resolution array CGH data
The availability of high resolution array comparative genomic hybridization (CGH) platforms has led to increasing complexities in data analysis. Specifically, defining contiguous regions of alterations or segmentation can be computationally intensive and popular algorithms can take hours to days for the processing of arrays comprised of hundreds of thousands to millions of elements. Additionall...
P-243: Prenatal Diagnosis Using Array CGH: Case Presentation
Background: Karyotype analysis has been the standard and reliable procedure for prenatal cytogenetic diagnosis since the 1970s. However, the major limitation remains the requirement for cell culture, resulting in a delay of as much as 14 days to get the test results. CGH array technology has proven to be useful in detecting causative genomic imbalances or genetic mutations in as many as 15% of child...
Journal: Bioinformatics
Volume 23, Issue 18
Publication date: 2007